Skip to content

preparations to remove context from repo struct #33893

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 29 commits into
base: main
Choose a base branch
from

Conversation

TheFox0x7
Copy link
Contributor

@TheFox0x7 TheFox0x7 commented Mar 14, 2025

first round of adding contexts to functions requiring it


trying keep it small'ish

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Mar 14, 2025
@github-actions github-actions bot added modifies/api This PR adds API routes or modifies them modifies/go Pull requests that update Go code labels Mar 14, 2025
@TheFox0x7 TheFox0x7 changed the title remove context from repo struct preparations to context from repo struct Mar 16, 2025
@TheFox0x7 TheFox0x7 marked this pull request as ready for review March 16, 2025 14:21
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Mar 16, 2025
@TheFox0x7 TheFox0x7 changed the title preparations to context from repo struct preparations to remove context from repo struct Mar 16, 2025
@wxiaoguang
Copy link
Contributor

Is it clear that the change is safe? If the ctx is abused, there would be deadlock problems.

For example: in CatFileBatchCheck, although there is a ctx parameter, actually it also uses "cached" repo.check created by previous ctx.

@wxiaoguang wxiaoguang marked this pull request as draft March 17, 2025 01:26
@wxiaoguang
Copy link
Contributor

wxiaoguang commented Mar 17, 2025

There are already some deadlock problems like this

So there could be 3 results of the "ctx" refactoring:

  • Will fix the deadlock problems: good end but we don't know how/why it fixes (unclear ....)
  • Leave the deadlock problems there: not good or bad, we still need to try to fix the deadlock problems.
  • Introduce more deadlock problems: bad end, things become worse

So ideally I think we should figure out and fix the deadlock problems before introducing unclear changes, and only introduce clear changes with necessary tests.


The first clear and necessary change in my mind is to clarify the func (repo *Repository) CatFileBatchCheck behavior: make sure the ctx won't be abused and the BatchCheck won't use wrong cached ctx.

@TheFox0x7
Copy link
Contributor Author

As far as I know this brings no functional change, it just changes/how when the context gets passed from: request -> open gitRepo with context -> function, to request -> open gitRepo -> function with context.
And it's far from being finished but I don't want to do it all in one go as few functions will have a 100+ line change on their own so it'll start being unreadable.


For whatever reason it took me a far bit to understand what exactly you meant but I see what you're talking about now.
If the cat-file is meant to be used request wide (as I think it's used now) then primarily the request context should handle it's lifecycle.

// Note those are very basic and don't represent real operations
func SomeEndpoint(ctx *context.Context){
gitRepo,err:=git.openRepository(ctx.repo)

...:=gitRepo.SomeGitFunction(ctx)

func (repo ...) SomeGitFunction(ctx,...){
// It uses CatFileBatch underneath
...:=repo.CatFileBatch(ctx) // request scoped so might be stuck of a lot of data gets requested
}

If it's meant to more usecase specific one I'd opt for local one but that's more wasteful probably:

func SomeEndpoint(ctx *context.Context){
gitRepo,err:=git.openRepository(ctx.repo)

...:=gitRepo.SomeGitFunction(ctx)

func (repo ...) SomeGitFunction(ctx,...){
ctx,cancel:=context.WithCancel(ctx)
defer cancel()
// It uses CatFileBatch underneath
...:=repo.CatFileBatch(ctx) // as soon as it's useless it's removed
}

I'll look into it more though input is welcome.
Few loose thoughts:
Why is the output buffered (twice)?
Maybe channels for IO would be better here? Or req/reply (I've spent too much time with nats lately and it show)

@TheFox0x7
Copy link
Contributor Author

The first clear and necessary change in my mind is to clarify the func (repo *Repository) CatFileBatchCheck behavior: make sure the ctx won't be abused and the BatchCheck won't use wrong cached ctx.

Can you clarify what I can do to help with it?
I think both batch functions should use request context when possible and cancellable ones otherwise but this is pretty much expected.
I'm also not sure why is there a temporary version when the cached one is busy...

@wxiaoguang
Copy link
Contributor

Maybe it needs a complete refactoring to make these functions overall right.

TBH I don't quite understand the old design either. By reading the commit history briefly, I can see that at the beginning there was no "ctx", so the "cached batch checker" was right. Then, "ctx" support was added, then more and more places started using "ctx", then the "cached batch checker with ctx" might go wrong. There are still many technical debts to pay. 🤷‍♂️

IIRC (just FYI) there were other "ctx" patches Use request timeout for git service rpc (#20689), Make git batch operations use parent context timeout instead of default timeout (#26325) due to various legacy ctx abuses. And #26325 is more or less related to this discussion & refactoring: CatFileBatchCheck


I'm also not sure why is there a temporary version when the cached one is busy...

No idea, either some people did too much "defensive programming" and added unnecessary design, or there is a real case that it would happen in real world. 🤷‍♂️

@lunny
Copy link
Member

lunny commented Mar 19, 2025

I created #33934 to try to rework CatFile implementation.

@wxiaoguang
Copy link
Contributor

Thank you for the PR. Some updates here:

  1. The deadlock problem mentioned above attr-check: context deadline exceeded #31600 has been resolved, it was caused by wrong return value of Write interface function.
  2. When I was working on some PRs like Add API endpoint to request contents of multiple files simultaniously #34139 , I found that the "git ctx" problem is more complex than I thought ...
    • the "gitRepo" and "ctx" is stored in "commit" directly and passed everywhere, and even some "commit cache (LastCommitCache)"
    • to make the "ctx" clean, maybe we need to refactor from deep to shallow (these "commit" and "cache" first, but not the parent contexts)

@TheFox0x7
Copy link
Contributor Author

Can you point me in the more specific direction here? I don't see at a glance where ctx is stored in commit or commit cache - except indirectly via Repository struct.

But overall my approach was more of finding where the embedded context was used and surfacing it. If there's a saner place to start with (which is not removing ctx from repository outright) I don't mind.

@wxiaoguang
Copy link
Contributor

The problematic design I just found is:

  • LastCommitCache -> git.Repository -> ctx
  • LastCommitCache -> git.Commit -> git.Tree -> git.Repository -> ctx

(ps: I just found it, I haven't really look into it or fully understood how the LastCommitCache works)

@TheFox0x7
Copy link
Contributor Author

So both of those are basically the same issue IMO - repository has ctx in it which it shouldn't have. We could either rework how the cache works so it doesn't carry repository reference - or continue this PR (in more PRs because it can become spaghetti to review fast) to remove the root case of all this - the ctx in repository struct.

I haven't looked at how commit cache works either.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 25, 2025

So both of those are basically the same issue IMO - repository has ctx in it which it shouldn't have.

Yep, similar, “Commit" is the main propagator.

We could either rework how the cache works so it doesn't carry repository reference - or continue this PR (in more PRs because it can become spaghetti to review fast) to remove the root case of all this - the ctx in repository struct.

TBH I think we need to refactor from deep to shallow (these "commit" and "cache" first, but not the parent contexts)

If we start the refactoring from the "parent functions", it's unclear whether the deep ctx users would abuse the wrong ctx and cause more problems.

So maybe we should start the refactoring from "Commit" first (updated below), to make sure the result of each change we are certain about, and make sure the ctx won't be abused.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 25, 2025

Some more thoughts (might not be right, just FYI).

It seems that the CatFileBatch and LastCommitCache are two blockers and they are the unclear parts of the refactoring. Maybe we need to clarify their behaviors and ctx usages first. As long as we can eliminate their "cache" behavior, it should be safe to search&replace other ctx usages (just like this PR does)

@TheFox0x7
Copy link
Contributor Author

Some loose thoughts too:
Since LastCommitCache is stored in repository already, it doesn't need to store that reference at all. It's already tied to a repo so we could just pass required items by func argument (or an entire repo reference if for some reason that's needed)

That cuts the LastCommitCache -> Repo -> Ctx embedding leaving leaving Repo (with LastCommitCache) -> Ctx.

Also LastCommitCache doesn't use ctx by itself unless I missed it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm/need 1 This PR needs approval from one additional maintainer to be merged. modifies/api This PR adds API routes or modifies them modifies/go Pull requests that update Go code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants